

 update equation


Robust volatility updates for Hierarchical Gaussian Filtering

arXiv.org Machine Learning

Hierarchical Gaussian Filtering (HGF) networks allow for efficient updating of posterior distributions (beliefs) about hidden states of an agent's environment. HGF parent nodes can target the mean or variance of their children. New information entering at input nodes leads to a cascade of belief updates across the network according to one-step update equations for each node's mean and precision (inverse variance). However, the original form of the update equations for variance-targeting parents (volatility coupling) can, in some regions of parameter space, lead to negative posterior precision, a logical impossibility which causes the updating algorithm to terminate with an error. In this report, we introduce a modified quadratic approximation to the variational energy of volatility-coupled nodes that avoids negative posterior precision. The key idea is to interpolate between two quadratic expansions of the variational energy: one at the prior prediction and one at a second mode whose location is obtained in closed form via the Lambert W function. The resulting update equations are robust across the entire parameter space and faithfully track the variational posterior even for large prediction errors.
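To make the failure mode concrete, the following is a minimal sketch of how a one-step precision update for a volatility parent can turn negative. It assumes the commonly cited form of the original HGF update, in which the posterior precision equals the predicted precision plus a term that depends on a weighting factor and the volatility prediction error. The function name and all parameter names (`pi_hat_parent`, `kappa`, `nu`, `sigma_child_prev`, `delta`) are illustrative assumptions, not taken from the report, and the sketch does not implement the modified update the report proposes.

```python
def volatility_precision_update(pi_hat_parent, kappa, nu, sigma_child_prev, delta):
    """Posterior precision of a volatility parent (sketch, assumed classic form).

    pi_hat_parent    : predicted (prior) precision of the parent
    kappa            : volatility coupling strength
    nu               : variance the parent's prediction contributes to the child
    sigma_child_prev : child's previous posterior variance
    delta            : volatility prediction error (bounded below by -1)
    """
    # Weighting factor in (0, 1): share of the child's predicted variance
    # that is attributable to the volatility parent.
    w = nu / (sigma_child_prev + nu)
    # Relative difference of the two variance contributions; equals 2*w - 1.
    r = (nu - sigma_child_prev) / (sigma_child_prev + nu)
    # Nothing in this expression prevents the sum from dropping below zero.
    return pi_hat_parent + 0.5 * kappa**2 * w * (w + r * delta)


# Small prediction error: the posterior precision stays positive.
small_error = volatility_precision_update(1.0, 1.0, 0.1, 1.0, 0.0)

# Large prediction error with w < 1/2 (so r < 0): the result is negative.
large_error = volatility_precision_update(1.0, 1.0, 0.1, 1.0, 50.0)

print(small_error > 0, large_error < 0)
```

Under these assumed equations, the second term is negative whenever `w < 1/2` and `delta > w / (1 - 2*w)`, so a sufficiently large prediction error can outweigh the predicted precision. This is the region of parameter space in which an unmodified update would fail.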









VaiPhy: a Variational Inference Based Algorithm for Phylogeny. Appendix A: The VaiPhy Algorithm

Neural Information Processing Systems

The update equations of VaiPhy follow the standard mean-field VI updates. Furthermore, i is the set of nodes except node i, and C is a constant. We utilize the NJ algorithm to initialize VaiPhy with a reasonable state. An example script to run PhyML is shown below. Here we provide two algorithmic descriptions of SLANTIS.
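For orientation, the generic mean-field coordinate update that the excerpt refers to can be written as follows. This is the textbook form, not VaiPhy's specific equations; the symbols $x$ (observations), $z$ (latent variables), and the notation $\neg i$ for "all nodes except node $i$" are assumptions chosen for illustration, since the excerpt's own notation did not survive extraction.

```latex
% Generic mean-field update for factor i (textbook form, assumed notation):
% the optimal log-factor is the expected log-joint under all other factors,
% up to an additive constant C that is fixed by normalization.
\log q^{*}(z_i) \;=\; \mathbb{E}_{q(z_{\neg i})}\!\left[ \log p(x, z) \right] + C
```

Each factor is updated in turn while the others are held fixed, and $C$ absorbs every term that does not depend on $z_i$.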


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

The authors prove that variational inference in LDA converges to the ground truth model, in polynomial time, for two different case studies with different underlying assumptions about the structure of the data. In this analysis, the authors employ "thresholded" EM updates which estimate the per-topic word distribution based on the subset of documents where a given topic dominates. The proofs, which are provided in a 35-page supplement, require assumptions about the number of words in a document that are uniquely associated with each topic, the number of topics per document, and the number of documents in which a given word exclusively identifies a topic. I am not enough of a specialist to evaluate the provided proofs in detail, so I will restrict myself to relatively high-level comments. Empirically speaking, variational inference can and does get stuck in local maxima.